Parallel Discourse Annotations on a Corpus of Short Texts
Abstract
We present the first corpus of texts annotated with two alternative approaches to discourse structure: Rhetorical Structure Theory (Mann and Thompson, 1988) and Segmented Discourse Representation Theory (Asher and Lascarides, 2003). A total of 112 short argumentative texts have been analyzed according to these two theories. In previous work, the same texts were also annotated for their argumentation structure, following the scheme of Peldszus and Stede (2013). The corpus therefore enables studies of correlations between the two accounts of discourse structure, and between discourse and argumentation. We converted the three annotation formats to a common dependency tree format that allows the structures to be compared, and we describe some initial findings.
Similar Resources
Rapid Development of a Corpus with Discourse Annotations using Two-stage Crowdsourcing
We present a novel approach for rapidly developing a corpus with discourse annotations using crowdsourcing. Although discourse annotations typically require much time and cost owing to their complex nature, we realize discourse annotations in an extremely short time while retaining good quality of the annotations by crowdsourcing two annotation subtasks. In fact, our experiment to create a corp...
The Use of Referential Constraints in Structuring Discourse
The quality of discourse structure annotations is negatively influenced by the numerous difficulties that occur in the analysis process. In contrast, referential annotation resources are considerably more reliable, given the high precision of the existent anaphora resolution systems. We present an approach based on the Veins Theory (Cristea, Ide, Romary, 1998), in which successful reference ann...
The DAD Parallel Corpora and their Uses
This paper deals with the uses of the annotations of third person singular neuter pronouns in the DAD parallel and comparable corpora of Danish and Italian texts and spoken data. The annotations contain information about the functions of these pronouns and their uses as abstract anaphora. Abstract anaphora have constructions such as verbal phrases, clauses and discourse segments as antecedents ...
Improving Discourse Relation Projection to Build Discourse Annotated Corpora
The naive approach to annotation projection is not effective for projecting discourse annotations from one language to another, because implicit discourse relations are often changed to explicit ones and vice versa in translation. In this paper, we propose a novel approach based on the intersection between statistical word-alignment models to identify unsupported discourse annotations. This appr...
Ordering Constraints on Discourse Relations
Recent work investigates the ordering constraints discourse relations impose on texts. Wolf and Gibson (2003) generated GraphBank, a discourse-annotated corpus without a prior constraint to tree-structured annotations, showing that naive annotators generate representations containing multiple-parented nodes and relations with crossing arguments. Webber et al. (2003) hypothesize that discourse m...